Search results for "ZIPF'S LAW"
Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts
1997
We perform a numerical study of the statistical properties of natural texts written in English and of two types of artificial texts. As statistical tools we use the conventional Zipf analysis of the distribution of words and the inverse Zipf analysis of the distribution of frequencies of words, the analysis of vocabulary growth, the Shannon entropy and a quantity which is a nonlinear function of frequencies of words, the frequency "entropy". Our numerical results, obtained by investigation of eight complete books and sixteen related artificial texts, suggest that, among these analyses, the analysis of vocabulary growth shows the most striking difference between natural and artificial texts…
Zipf’s Law and World Income Distribution
2008
The aim of this article is to demonstrate regularity in the world income distribution. In particular, using GDP per capita data for the period 1980 to 2004, the article shows that the world income distribution follows the well-known 'rank-size rule'.
Comparison of MeSH terms and KeyWords Plus terms for more accurate classification in medical research fields. A case study in cannabis research
2021
KeyWords Plus and Medical Subject Headings (MeSH) are widely used in bibliometric studies for topic mapping. The objective of this study is to compare the two description systems in documents about cannabis research to find the concordance between systems and establish whether there is neutrality in topic mapping. A total of 25,593 articles from 1970 to 2019 were drawn from Web of Science's Core Collection and Medline and analyzed. The tidytext library, Zipf's law, topic modeling tools, the contingency coefficient, Cramer's V, and Cohen's kappa were used. The results included 10,107 MeSH terms and 28,870 KeyWords Plus terms. The Zipf distribution of the terms was different for each…
Pareto or log-normal? A recursive-truncation approach to the distribution of (all) cities
2012
Traditionally, it is assumed that the population size of cities in a country follows a Pareto distribution. This assumption is typically supported by finding evidence of Zipf's Law. Recent studies question this finding, highlighting that, while the Pareto distribution may fit reasonably well when the data is truncated at the upper tail, i.e. for the largest cities of a country, the log-normal distribution may apply when all cities are considered. Moreover, conclusions may be sensitive to the choice of a particular truncation threshold, a yet overlooked issue in the literature. In this paper, then, we reassess the city size distribution in relation to its sensitivity to the choice of truncat…
The 'power' of tourism in Portugal
2012
The author analyses the upper tail of the distribution of tourism supply in Portugal from 2002 to 2009, using data from the Instituto Nacional de Estatística database. Tourism supply is defined in terms of the lodging capacity of hotel establishments in about 250 tourist destinations. The paper shows that the empirical distribution of tourism supply in Portugal is heavy-tailed and consistent with a power law behaviour in its upper tail. Such behaviour seems to be stable over the years, provided that, for the time horizon covered by the data sets, the scaling parameter is always close to the value of two. The power law hypothesis is tested positively through the use of graphical and analyti…
Analysis of the relationship between position in the "Diamenty Forbesa" ranking and growth in enterprise value
2016
The position of a company in various rankings depends on many factors, and it is often difficult to satisfy all requirements at the same time so as to occupy one of the top positions. Analysis of the size of enterprises in various countries confirms the trend: there are many small enterprises, and large ones are in the minority. Considering rankings of firms by size of assets, Zipf justified the relationship between position in the ranking and the size of these assets. The aim of the article was to determine the relationship between the growth of the company and its position in the ranking. Studies have shown that it is possible to describe the dependence of growth ent…